Details of the Adjusted Rand index and Clustering algorithms
Abstract
Let D be the number of pairs of objects that are placed in the same class in one partition and in the same cluster in the other, E be the number of pairs of objects in the same class but not in the same cluster, F be the number of pairs of objects in the same cluster but not in the same class, and G be the number of pairs of objects in different classes and different clusters in both partitions. The quantities D and G can be interpreted as agreements, and E and F as disagreements. The Rand index [Rand, 1971] is simply $\frac{D + G}{D + E + F + G}$.
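The pair counts described above can be computed directly by enumerating every pair of objects. The following is a minimal sketch, assuming each partition is given as a list of labels in the same object order, and taking the Rand index in its standard form as the fraction of agreeing pairs, (D + G) / (D + E + F + G). The function names `pair_counts` and `rand_index` and the example labels are illustrative, not taken from the source.

```python
from itertools import combinations


def pair_counts(classes, clusters):
    """Return (D, E, F, G) for two labelings of the same objects.

    classes[i] is the class label of object i, clusters[i] its cluster label.
    """
    if len(classes) != len(clusters):
        raise ValueError("both labelings must cover the same objects")
    D = E = F = G = 0
    for i, j in combinations(range(len(classes)), 2):
        same_class = classes[i] == classes[j]
        same_cluster = clusters[i] == clusters[j]
        if same_class and same_cluster:
            D += 1  # agreement: together in both partitions
        elif same_class:
            E += 1  # disagreement: same class, different clusters
        elif same_cluster:
            F += 1  # disagreement: same cluster, different classes
        else:
            G += 1  # agreement: separated in both partitions
    return D, E, F, G


def rand_index(classes, clusters):
    """Fraction of object pairs on which the two partitions agree."""
    D, E, F, G = pair_counts(classes, clusters)
    total = D + E + F + G
    if total == 0:
        raise ValueError("need at least two objects")
    return (D + G) / total


# Hypothetical example: six objects, two classes, three clusters.
classes = [0, 0, 0, 1, 1, 1]
clusters = [0, 0, 1, 1, 2, 2]
print(pair_counts(classes, clusters))  # (2, 4, 1, 8)
print(rand_index(classes, clusters))   # 10 of 15 pairs agree
```

Enumerating pairs costs time quadratic in the number of objects, which is acceptable for a sketch. For large data sets the same counts can be derived from a contingency table of class and cluster labels.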
Similar resources
Clustering Gene Expression Data Using Random Forest Dissimilarity
Background: The clustering of gene expression data plays an important role in the diagnosis and treatment of cancer. These data typically involve a large number of variables (genes) compared with the number of samples (patients). Many clustering methods are based on the dissimilarity among observations, calculated by a distance function. As increa...
Weighted Ensemble Clustering for Increasing the Accuracy of the Final Clustering
Clustering algorithms are highly dependent on factors such as the number of clusters, the specific clustering algorithm, and the distance measure used. Inspired by ensemble classification, one approach to reducing the effect of these factors on the final clustering is ensemble clustering. Since weighting the base classifiers has been a successful idea in ensemble classification, in th...
Supplement to “Clustering Gene Expression Data with Repeated Measurements”
Cluster accuracy: agreement with the functional categories. Each entry shows the adjusted Rand index of the corresponding algorithm with the functional categories. The maximum adjusted Rand index of each row is shown in bold. The algorithms (rows) are sorted in descending order of the maximum adjusted Rand index in each row. DIANA and single-link produce the least accurate clusters. *CAST did not conv...
Performance of an Ensemble Clustering Algorithm on Biological Data Sets
Ensemble clustering is a promising approach that combines the results of multiple clustering algorithms to obtain a consensus partition by merging different partitions based upon well-defined rules. In this study, we use an ensemble clustering approach for merging the results of five different clustering algorithms that are sometimes used in bioinformatics applications. The ensemble clustering ...
Selecting Ensemble Members in Ensemble Clustering Using Voting
Clustering is the process of dividing a dataset into subsets called clusters, so that objects within a cluster are similar to each other and different from objects in other clusters. Many algorithms based on different approaches have been developed for clustering. An effective choice can be to combine two or more of these algorithms to solve the clustering problem. Ensemb...
Publication date: 2001